DETAILED ACTION

1. The amendment filed on 24 October 2005 has been received and entered. The applicant
has cancelled claims 43-44, 46-60 and 62-64. Claims 1-42, 45 and 61 as amended are pending in 
the application. 



Double Patenting 

The nonstatutory double patenting rejection is based on a judicially created doctrine 
grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or 
improper timewise extension of the "right to exclude" granted by a patent and to prevent possible 
harassment by multiple assignees. See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed.
Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686
F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA
1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to 
overcome an actual or provisional rejection based on a nonstatutory double patenting ground 
provided the conflicting application or patent is shown to be commonly owned with this 
application. See 37 CFR 1.130(b). 

Effective January 1, 1994, a registered attorney or agent of record may sign a terminal 
disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 
CFR 3.73(b). 



2. Claims 1-21, 35-41 and 45 are provisionally rejected under the judicially created doctrine 
of obviousness-type double patenting as being unpatentable over claims 1-17, 33-39 and 43 of 
copending Application No. 10/155,358. Although the conflicting claims are not identical, they 
are not patentably distinct from each other because both sets of claims deal with methods of 
processing a video directed to a sports match including identifying a plurality of segments of the 
video based upon an event, wherein the event is characterized by a start time and an end time, 
and creating a summarization of the sports video by including the plurality of segments. 
Although the start and end times in claims 1-21, 35-41 and 45 of the present application are based
upon events specifically occurring for use in a video of a sumo match and the start and end times
in claims 1-17, 33-39 and 43 of copending Application No. 10/155,358 are based upon the
intended use of events for a video of a sports game, both applications deal with creating a video
summary from identified segments of a video, based upon a sports event with a start and end time.
The identification of segments recited in the claimed limitations can be performed by the user and is
applicable to any video subject matter.

This is a provisional obviousness-type double patenting rejection because the conflicting 
claims have not in fact been patented. 

3. Claims 1-42 and 45 are provisionally rejected under the judicially created doctrine of 
obviousness-type double patenting as being unpatentable over claims 1-52, 60-63, 65-68 and 75 
of copending Application No. 09/933,862. Although the conflicting claims are not identical, 
they are not patentably distinct from each other because both sets of claims deal with methods of 
processing a video directed to a sports match including identifying a plurality of segments of the 
video based upon an event, wherein the event is characterized by a start time and an end time, 
and creating a summarization of the sports video by including the plurality of segments. 
Although the start and end times in claims 1-42 and 45 of the present application are based upon
events occurring for use in a video of a sumo match and the start and end times in claims 1-52,
60-63, 65-68 and 75 of copending Application No. 09/933,862 are based upon the intended use of
events for a video of a football game, both applications deal with creating a video summary from
identified segments of a video, based upon a sports event with a start and end time. The
identification of segments recited in the claimed limitations can be performed by the user and is
applicable to any video subject matter.

This is a provisional obviousness-type double patenting rejection because the conflicting 
claims have not in fact been patented. 

Claim Rejections - 35 USC § 112 
The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the 
subject matter which the applicant regards as his invention. 

4. Claims 11, 18, 22, 27, 31, 34-36 and 39 are rejected under 35 U.S.C. 112, second 
paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject 
matter which applicant regards as the invention. 

• Claim 11 recites the limitations "the start" and "the end" in lines 4-5. There is
insufficient antecedent basis for these limitations in the claim. 

• Claims 18, 22, 27, 31 and 34-35 recite the limitation "the start" in lines 3, 3, 3, 3,
4 and 4 of the respective claims. There is insufficient antecedent basis for this 
limitation in the claims. 

• Claim 36 recites the limitation "the end" in line 4. There is insufficient
antecedent basis for this limitation in the claim. 

• The term "sufficiently" in claim 39 is a relative term which renders the claim 
indefinite. The term "sufficiently" is not defined by the claim, the specification 
does not provide a standard for ascertaining the requisite degree, and one of 
ordinary skill in the art would not be reasonably apprised of the scope of the
invention. The term "sufficiently" is indefinite because one of ordinary skill in 
the art would not be able to determine what time duration would qualify as being 
"sufficiently" short. The specification and claims do not provide a standard for a 
"sufficiently short duration" and the determination of whether a duration is
"sufficiently short" would be subject to the interpretation of a user.

Claim Rejections - 35 USC § 101 
35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or 
any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and 
requirements of this title. 

5. Claims 1, 7, 11, 18, 22, 27, 31, 35, 36, 39, 42 and 45 are rejected under 35 U.S.C. 101
because the claimed invention is directed to a non-statutory abstract idea. The cited claims are
directed to a video processing method; however, they fail to produce a practical application in
the technological arts. For such subject matter to be statutory, the claimed process must be 
limited to a practical application of the abstract idea or mathematical algorithm in the 
technological arts. See Alappat, 33 F.3d at 1543, 31 USPQ2d at 1556-57 (quoting Diamond v. 
Diehr, 450 U.S. at 192, 209 USPQ at 10). See MPEP 2106. The claimed invention is directed 
solely to an abstract idea or to manipulation of abstract ideas, and does not produce a practical 
application in the technological arts. The steps recited in the claims, i.e. identify a plurality of 
segments of a video, and create a summarization of the video from the identified segments, can 
be done by a user in his mind, i.e. identify a plurality of segments of a video that the user
watched by remembering a plurality of memorable segments of the video and combine the
segments he remembers to create a mental summarization of the video; none of the claimed steps
are required to be performed on or by a computer, or computer-related system, and amount to an
abstract idea that does not produce a practical application in the technological arts.

Application/Control Number: 10/058,684 Page 6

Art Unit: 2173

6. To expedite a complete examination of the instant application the claims rejected under 
35 U.S.C. 101 (nonstatutory) above are further rejected as set forth below in anticipation of 
applicant amending these claims to place them within the four statutory categories of invention. 

Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

7. Claims 1-7, 9-17, 22-33, 36-37 and 61 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over "Indexing of Baseball Telecast for Content-based Video Retrieval", 
Kawashima et al. (hereinafter Kawashima). 

Referring to claim 1, Kawashima teaches a method of processing a video including 
baseball comprising: (a) identifying a plurality of segments of the video based upon an event, 
wherein the event is characterized by a start time based upon when the ball is put into play and 
an end time based upon when the ball is considered out of play, where each of the segments 
includes a plurality of frames of the video (pp. 871-873, sections 1.1, 1.2, 2.1 and 2.2; e.g. the at
bat event comprising a start point in time slightly before the pitching and an endpoint in time
slightly after the catcher catches the ball if the batter is struck out and after the ball is thrown to a
baseman if the ball is hit); and

(b) creating a summarization of the video by including the plurality of segments, where
the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; i.e. the
indexed video segments are a digest of the game or summary of the video, a.k.a. compressed
play).

Although Kawashima does not explicitly teach the start time based upon specific sumo 
events such as when the players line up to charge one another and an end time based upon 
specific sumo events such as when one of the players at least one of steps outside the ring and
touches the ring surface with part of his body other than the soles of his feet, Kawashima teaches
the start time and end time based upon baseball events. Since both sumo wrestling and baseball 
belong to a class of sporting events modeled as a sequence of "plays" identified by a start time 
and end time, it would have been obvious to one of ordinary skill in the art to apply the video 
summarization of identified segments based upon baseball events taught by Kawashima to 
segments based upon other sporting events such as a sumo wrestling match. One would have 
been motivated to make such a combination in order to generate a compact representation of the 
content of video material, allowing easy browsing, filtering, indexing and retrieval, etc., enabling 
users only interested in the exciting highlights of a game to skip the long and often boring 
portions of watching a sporting match in its entirety. 

In a similar manner, all subsequent claims dealing with teachings of limitations for use in
a baseball game can be applied for use in any other sports games, including sumo wrestling.
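
For illustration only, the two steps discussed for claim 1 (identifying segments by a start and an end, then creating a summarization with fewer frames than the video) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Kawashima or of the application: the `Segment` representation as frame-index pairs and the function name `summarize` are hypothetical, and the event detection that would produce the segments is assumed to have been done elsewhere.

```python
from typing import List, Tuple

# Hypothetical representation: (start_frame, end_frame), end exclusive.
Segment = Tuple[int, int]


def summarize(num_frames: int, segments: List[Segment]) -> List[int]:
    """Return the indices of the frames kept in the summary, in temporal order."""
    kept: List[int] = []
    for start, end in sorted(segments):
        # Skip frames already kept by an overlapping earlier segment,
        # and clamp the segment to the bounds of the video.
        start = max(start, kept[-1] + 1 if kept else 0)
        end = min(end, num_frames)
        kept.extend(range(start, end))
    return kept


# Usage: a 100-frame video with three detected "plays", two of which overlap.
summary = summarize(100, [(10, 20), (15, 25), (50, 60)])
print(len(summary))  # number of frames in the summary
```

Because only frames inside the identified segments are kept, the summary contains fewer frames than the video whenever the segments do not cover it entirely.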

Referring to claim 2, Kawashima teaches a method of processing a video including 
baseball wherein the event is defined by the rules of baseball (pp. 871-873, sections 1.1-2.1.4; 
events such as scenes in which a batter was struck out or got a hit or a home run are defined by the
rules of baseball using a spotting technique comprising a search of the minimal warp function by 
comparing input video sequence with pitching/batting model sequences). Similarly, it would 
have been obvious that the event can be defined by the rules of sumo. 

Referring to claims 3, 6 and 15, Kawashima teaches a method of processing a video 
including baseball wherein the start time is temporally proximate a baseball pitch (pg. 872, lines 
10-11). Similarly, it would have been obvious that the start time can be temporally proximate a 
sumo event, such as a charge of the two players, or includes a portion of the pre-bout 
ceremonies. 

Referring to claims 4, 5, 16 and 17, Kawashima teaches a method of processing a video 
including baseball wherein the end time is temporally proximate to the batter missing the ball 
with a bat (pg. 872, lines 12-15). Similarly, it would have been obvious that the end time can be 
temporally proximate a sumo event, such as stepping out of the ring or touching of the surface of 
the ring by a part of his body other than the soles of his feet. 

Referring to claim 7, Kawashima teaches a method of processing a video including 
baseball comprising identifying a plurality of segments of the video, where each of the segments 
includes a plurality of frames of the video, based upon a series of activities defined by the rules 
of baseball (pp. 871-873, sections 1.1, 1.2, 2.1 and 2.2; series of activities such as scenes in which
a batter was struck out or got a hit or a home run are defined by the rules of baseball using a
spotting technique comprising a search of the minimal warp function by comparing input video
sequence with pitching/batting model sequences) that could potentially result in at least one of a
score and preventing a score, and creating a summarization of the video by including the plurality of
segments where the summarization includes fewer frames than the video (Abstract; pg. 872,
section 1.2; i.e. the indexed video segments are a digest of the game or summary of the video,
a.k.a. compressed play).

Although Kawashima does not explicitly teach the frames are based upon a series of 
activities defined by the rules of sumo, Kawashima teaches the frames are based upon a series of 
activities defined by the rules of baseball. As previously mentioned, it would have been obvious 
to one of ordinary skill in the art to apply the video summarization of identified segments based 
upon baseball events taught by Kawashima to segments based upon other sporting events such as 
a sumo wrestling match. One would have been motivated to make such a combination in order 
to generate a compact representation of the content of video material, allowing easy browsing, 
filtering, indexing and retrieval, etc., enabling users only interested in the exciting highlights of a 
game to skip the long and often boring portions of watching a sporting match in its entirety. 

Referring to claim 9, Kawashima teaches wherein the activities are determined based 
upon the color characteristics of the video (pp. 872-873, section 2.1.3; activities are spotted by
calculating the value from the count of pixels whose intensity changes in successive frames are
larger than a threshold, wherein pixels are painted/colored to form an image produced on the
screen).
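
The measure cited above (a count of pixels whose intensity change between successive frames exceeds a threshold) can be sketched as follows. This is an illustrative sketch only and is not taken from Kawashima: the grayscale `Frame` type, the function names, and the threshold value in the usage lines are assumptions.

```python
from typing import List

# Hypothetical frame type: rows of grayscale intensities (0-255).
Frame = List[List[int]]


def changed_pixel_count(prev: Frame, curr: Frame, threshold: int) -> int:
    """Count pixels whose intensity change between two frames exceeds threshold."""
    return sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for a, b in zip(row_prev, row_curr)
        if abs(b - a) > threshold
    )


def activity_signal(frames: List[Frame], threshold: int) -> List[int]:
    """Changed-pixel count for each pair of successive frames."""
    return [
        changed_pixel_count(f0, f1, threshold)
        for f0, f1 in zip(frames, frames[1:])
    ]


# Usage: three tiny 2x2 frames and an assumed threshold of 30.
frames = [
    [[0, 0], [0, 0]],
    [[0, 50], [0, 0]],
    [[100, 50], [90, 0]],
]
print(activity_signal(frames, threshold=30))  # [1, 2]
```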

Referring to claim 10, Kawashima teaches wherein the activities are determined based
upon scene changes (pp. 872-873, sections 1.1-2.1.4; wherein an activity such as an at bat activity
is a period from a basic scene to the next basic scene).

Referring to claim 11, Kawashima teaches a method of processing a video including
baseball comprising: 

(a) identifying a plurality of segments of the video based upon detecting a play of the 
baseball game, wherein the identifying includes detecting the start of the play and detecting the 
end of the play, where each of the segments includes a plurality of frames of the video (pp. 871-873,
sections 1.1, 1.2, 2.1 and 2.2; e.g. detecting the start of the play in which a batter was struck
out or got a hit or a home run, as defined by the rules of baseball, using a spotting technique
comprising a search of the minimal warp function by comparing input video sequence with
pitching/batting model sequences); and

(b) creating a summarization of the video by including the plurality of segments, where 
the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; i.e. the
indexed video segments are a digest of the game or summary of the video, a.k.a. compressed
play). 

Although Kawashima does not explicitly teach the identified segments of the video being 
based upon detecting a play of a sumo match, Kawashima teaches the identified segments of the 
video being based upon detecting a play of a baseball game. As previously mentioned, it would 
have been obvious to one of ordinary skill in the art to apply the video summarization of 
identified segments based upon baseball events taught by Kawashima to segments based upon 
other sporting events such as a sumo wrestling match. One would have been motivated to make 
such a combination in order to generate a compact representation of the content of video
material, allowing easy browsing, filtering, indexing and retrieval, etc., enabling users only 
interested in the exciting highlights of a game to skip the long and often boring portions of 
watching a sporting match in its entirety. 

Referring to claim 12, Kawashima teaches a method of processing a video including 
baseball wherein the detecting the end of the play is based upon detecting the start of the play 
(pp. 872-873, sections 1.1-2.1.4; wherein a play such as an at bat activity is a period from an end
of a basic scene to the start of the next basic scene). 

Referring to claim 13, Kawashima teaches wherein the summarization identifies the 
plurality of segments of the video (pg. 872, section 1.2). 

Referring to claim 14, Kawashima teaches wherein the summarization is a summarized 
video comprising the plurality of segments excluding at least a portion of the video other than the 
plurality of segments (pg. 872, section 1.2). 

Referring to claim 22, Kawashima teaches a method of processing a video including 
baseball comprising: (a) identifying a plurality of segments of the video based upon an event, 
wherein the event is characterized by a start of a plurality of segments representing when the ball 
is put into play, where each of the segments includes a plurality of frames of the video (pp. 871-873,
sections 1.1, 1.2, 2.1 and 2.2; e.g. the at bat event comprising a start); and

(b) creating a summarization of the video by including the plurality of segments, where 
the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; i.e. the 
indexed video segments are a digest of the game or summary of the video, a.k.a. compressed
play). 

Although Kawashima does not explicitly teach the start based upon specific sumo events
such as a pair of regions having a dominant color description representative of skin tone to
represent two sumo wrestlers lined up to charge one another, Kawashima teaches the start of the
plurality of identified segments based upon baseball events, such as scenes representative of two
baseball batters, i.e. a change from one batter to another batter (pg. 871, section 1.1). Since both
sumo wrestling and baseball belong to a class of sporting events modeled as a sequence of
"plays" identified by a start and end, it would have been obvious to one of ordinary skill in the
art to apply the video summarization of identified segments based upon baseball events taught by
Kawashima to segments based upon other sporting events such as a sumo wrestling match (in
other words, it would have been obvious that the detected scenes can have a dominant color 
description representative of skin tone to represent two sumo wrestlers). One would have been 
motivated to make such a combination in order to generate a compact representation of the 
content of video material, allowing easy browsing, filtering, indexing and retrieval, etc., enabling 
users only interested in the exciting highlights of a game to skip the long and often boring 
portions of watching a sporting match in its entirety. 

Referring to claims 23-25, Kawashima teaches identifying the start of a plurality of
segments based upon a pair of regions having a dominant color description representative of an
event. Applicant has not disclosed that varying the percentage values of the pair of regions
included in the dominant color description provides an advantage, is used for a particular
purpose, or solves a stated problem. One of ordinary skill in the art, furthermore, would have
expected Applicant's invention to perform equally well with the dominant color description
including any percent change in shading of a region of a video frame because limitations of
varying percentage values of the pair of regions included in the dominant color description are
design choices that do not affect the functionality of the method of identifying a plurality of
segments based upon a pair of regions having a dominant color description. Therefore, at the
time the invention was made, it would have been obvious to one of ordinary skill in the art to
modify the dominant color descriptions taught by Li to include varying percentages of the pair of
regions, such as 25 percent, 50 percent, or 75 percent, to obtain the invention as specified in the
claims. One would have been motivated to make such a combination in order to provide users
with a plurality of implementation preferences to choose from.

Referring to claim 26, Kawashima, as modified, teaches identifying the start of a plurality
of segments based upon a pair of regions having a dominant color representative of an event.
Applicant has not disclosed that the position of the pair of regions in a particular portion of the
video provides an advantage, is used for a particular purpose, or solves a stated problem. One of
ordinary skill in the art, furthermore, would have expected Applicant's invention to perform
equally well with the pair of regions placed anywhere in the video, because limitations of
positional placement of the pair of regions are design choices that do not affect the functionality
of the method of identifying a plurality of segments based upon a pair of regions having a
dominant color description. Therefore, at the time the invention was made, it would have been
obvious to one of ordinary skill in the art to modify the pair of regions, or skin color blobs,
taught by Li to be placed anywhere in the video, such as the lower portion, to obtain the
invention as specified in the claims. One would have been motivated to make such a combination in
order to provide users with a plurality of implementation preferences to choose from.

Referring to claim 27, Kawashima teaches a method of processing a video including
baseball comprising: (a) identifying a plurality of segments of the video based upon an event,
wherein the event is characterized by a start of a plurality of segments based upon when the ball
is put into play, where each of the segments includes a plurality of frames of the video (pp. 871-873,
sections 1.1, 1.2, 2.1 and 2.2; e.g. the at bat event comprising a start); and

(b) creating a summarization of the video by including the plurality of segments, where 
the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; i.e. the 
indexed video segments are a digest of the game or summary of the video, a.k.a. compressed
play). 

Although Kawashima does not explicitly teach the start based upon specific sumo events 
such as a pair of regions generally symmetric to each other with respect to a generally center 
column of a frame of the video representing when the players line up to charge one another, 
Kawashima teaches the start of the plurality of identified segments based upon baseball events. 
Since both sumo wrestling and baseball belong to a class of sporting events modeled as a 
sequence of "plays" identified by a start and end, it would have been obvious to one of ordinary 
skill in the art to apply the video summarization of identified segments based upon baseball 
events taught by Kawashima to segments based upon other sporting events such as a sumo 
wrestling match. One would have been motivated to make such a combination in order to 
generate a compact representation of the content of video material, allowing easy browsing, 
filtering, indexing and retrieval, etc., enabling users only interested in the exciting highlights of a 
game to skip the long and often boring portions of watching a sporting match in its entirety. 

Referring to claims 29-30, Kawashima teaches identifying a plurality of segments of a
video, wherein the start of the plurality of segments is identified based upon a pair of regions 
generally symmetric to each other with respect to a generally center column of a frame of the 
video, where each of the segments includes a plurality of frames of the video. Applicant has not 
disclosed that varying the percentage values of the center column within the center of the frame 
provides an advantage, is used for a particular purpose, or solves a stated problem. One of 
ordinary skill in the art, furthermore, would have expected Applicant's invention to perform 
equally well with the symmetric blobs to be symmetric to each other with respect to the center 
column being any percentage within the center of the frame because limitations of the center 
column being within a varying percentage value within the center of the frame are design choices 
that do not affect the functionality of the method. Therefore, at the time the invention was made, 
it would have been obvious to one of ordinary skill in the art, to modify the center of the two 
symmetric blobs to be within any number of varying percentages of the center of the frame to 
obtain the invention as specified in the claims. One would have been motivated to make such a
combination in order to provide users with a plurality of implementation preferences to choose
from.

Referring to claim 31, Kawashima teaches a method of processing a video including 
baseball comprising: 

(a) identifying a plurality of segments of the baseball video, wherein the start is based
upon spatial regions corresponding to a baseball pitch (pg. 872, lines 10-11; wherein an activity
such as an at bat activity is a period from a basic scene to the next basic scene); and

(b) creating a summarization of the video by including the plurality of segments, where
the summarization includes fewer frames than the baseball video (Abstract; pg. 872, section 1.2;
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a.
compressed play).

Although Kawashima does not explicitly teach the identified segments of the video being 
of a sumo video and the start of the segments based upon a pair of spatial regions that move 
toward one another, Kawashima teaches the identified segments of the video being of a baseball 
game video and the start of the plurality of segments being based upon spatial regions 
corresponding to a baseball pitch. As previously mentioned, it would have been obvious to one 
of ordinary skill in the art to apply the video summarization of identified segments based upon 
baseball events taught by Kawashima to segments based upon other sporting events such as a 
sumo wrestling match, i.e. regions that move toward one another representing a charge of two 
players. One would have been motivated to make such a combination in order to generate a 
compact representation of the content of video material, allowing easy browsing, filtering, 
indexing and retrieval, etc., enabling users only interested in the exciting highlights of a game to 
skip the long and often boring portions of watching a sporting match in its entirety. 

Referring to claims 28 and 32, Kawashima teaches finding scenes representative of two
baseball batters, i.e. a change from one batter to another batter (pg. 871, section 1.1). Similarly,
it would have been obvious that the detected scenes can have a dominant color description 
representative of skin tone to represent two sumo wrestlers. 

Referring to claim 33, Kawashima teaches scenes with regions representing the collision
of a bat with a ball, i.e. hitting a baseball with the player's bat (pg. 871, section 1). Similarly, it
would have been obvious that the detected scene regions could represent collision in a sumo
match, such as the collision of two players.

Referring to claim 36, Kawashima teaches a method of processing a video including 
baseball comprising: 

(a) identifying a plurality of segments of the baseball video, wherein the identifying for 
the end of at least one of the segments is based upon detecting a scene change, where each of the 
segments includes a plurality of frames of the video (pp. 871-873, sections 1.1, 1.2, 2.1 and 2.2; 
wherein an activity such as an at bat activity is a period from a basic scene to the next basic 
scene); and 

(b) creating a summarization of the video by including the plurality of segments, where 
the summarization includes fewer frames than the baseball video (Abstract; pg. 872, section 1.2;
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a.
compressed play). 

Although Kawashima does not explicitly teach the identified segments of the video being 
of a sumo video, Kawashima teaches the identified segments of the video being of a baseball 
game video. As previously mentioned, it would have been obvious to one of ordinary skill in the 
art to apply the video summarization of identified segments based upon baseball events taught by 
Kawashima to segments based upon other sporting events such as a sumo wrestling match. One 
would have been motivated to make such a combination in order to generate a compact 
representation of the content of video material, allowing easy browsing, filtering, indexing and 
retrieval, etc., enabling users only interested in the exciting highlights of a game to skip the long 
and often boring portions of watching a sporting match in its entirety. 

Referring to claim 37, Kawashima teaches the scene change is based upon a threshold
between at least two frames (pg. 872, section 2.1.1; detecting scenes by determining a similarity
compared to a threshold level between a set of frames).
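
As a hedged illustration of a scene change "based upon a threshold between at least two frames", the sketch below flags a scene change wherever a dissimilarity measure between consecutive frames exceeds a threshold. The choice of mean absolute intensity difference as the measure, the function names, and the threshold in the usage lines are all assumptions made for this example; the text above does not say which similarity measure Kawashima uses.

```python
from typing import List

# Hypothetical frame type: rows of grayscale intensities (0-255).
Frame = List[List[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute intensity difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            count += 1
    return total / count if count else 0.0


def scene_changes(frames: List[Frame], threshold: float) -> List[int]:
    """Indices i at which frame i differs from frame i-1 by more than threshold."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Usage: four 1x2 frames; the jump between frame 1 and frame 2 is a cut.
frames = [[[10, 10]], [[12, 11]], [[200, 210]], [[201, 209]]]
print(scene_changes(frames, threshold=50.0))  # [2]
```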

Referring to claim 61, Kawashima teaches a method of processing a video including 
baseball comprising: 

(a) identifying a plurality of segments of video potentially having graphical text segments 
(detecting texts superimposed on the scenes) (pg. 871, section 1.1; pg. 872, section 2.1.2),
wherein the detection of graphical text segments is identified based upon:, where each of the 
segments includes a plurality of frames of the video; and 

(b) creating a summarization of the video by including the plurality of segments, where 
the summarization includes fewer frames than the baseball video (Abstract; pg. 872, section 1.2;
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a.
compressed play). 

Although Kawashima does not explicitly teach the identified segments of the video being 
of a sumo video and the detection of graphical text based upon sumo events such as a pair of 
substantially white regions generally symmetric with respect to a vertical centerline of the image, 
the image free from other significantly substantially white areas, the white regions persist for a 
plurality of seconds, the white regions preceding the start of a play, representing the lining up of 
two sumo wrestlers before the start of a sumo match, Kawashima teaches the identified segments 
of the video being of a baseball game video and the identification of superimposed text based 
upon the beginning of an at-bat event. As previously mentioned, it would have been obvious to 
one of ordinary skill in the art to apply the video summarization of identified segments based 
upon baseball events taught by Kawashima to segments based upon other sporting events such as 
a sumo wrestling match, i.e. segments representing the lining up of two sumo wrestlers before 
the start of a sumo match. One would have been motivated to make such a combination in order 
to generate a compact representation of the content of video material, allowing easy browsing, 
filtering, indexing and retrieval, etc., enabling users only interested in the exciting highlights of a 
game to skip the long and often boring portions of watching a sporting match in its entirety. 

8. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over "Indexing of 
Baseball Telecast for Content-based Video Retrieval", Kawashima et al. (hereinafter 
Kawashima), as applied to claim 7 above, and "Automatically Extracting Highlights for TV 
Baseball Programs", Rui et al. (hereinafter Rui). 

Referring to claim 8, Kawashima teaches all of the limitations as applied to claim 7 
above. In addition, Kawashima teaches a method of processing a video including baseball 
wherein the summarization of the plurality of segments comprises a plurality of segments within 
the video (pg. 872, section 1.2; the indexed video segments of the summarization of the plurality 
of segments is stored as a digest of the game). Kawashima fails to explicitly teach the 
summarization of the plurality of segments to be in the same temporal order as the plurality of 
segments within the video. Rui further teaches wherein the summarization of the plurality of 
segments is in the same temporal order as the plurality of segments within the video (Abstract; 
section 5.4; Introduction; a method of allowing users to watch just the highlights of the exciting 
portions instead of the whole game due to time constraints, i.e. highlights are extracted 
automatically so that viewing time can be reduced). Therefore, it would have been obvious to 
one of ordinary skill in the art, having the teachings of Kawashima and Rui before him at the 
time the invention was made, to include Rui's method of processing a video including baseball 
wherein the summarization of the plurality of segments is in the same temporal order as the 
plurality of segments within the video in Kawashima's method of processing a video including 
baseball wherein the summarization of the plurality of segments comprises a plurality of 
segments within the video. One would have been motivated to make such a combination so that 
the time spent viewing sequential plays in a game is reduced. 
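
For illustration, the claim 8 limitation (a summarization kept in the same temporal order as the segments within the video) could be sketched as below. The representation of a segment as an inclusive (start_frame, end_frame) pair and the function name are assumptions of this sketch, not features disclosed by Kawashima or Rui; segments are assumed not to overlap.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (start_frame, end_frame) indices.
Segment = Tuple[int, int]


def summarize_in_temporal_order(segments: List[Segment]) -> List[int]:
    """Return the frame indices of the summary, ordered as they occur in the video."""
    summary: List[int] = []
    # Sorting by start frame restores the order the segments have in the source video.
    for start, end in sorted(segments):
        summary.extend(range(start, end + 1))
    return summary
```

The summary contains only frames drawn from the identified segments, so it has fewer frames than the full video whenever the segments do not cover it entirely.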

9. Claims 18-21 and 34 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
"Indexing of Baseball Telecast for Content-based Video Retrieval", Kawashima et al. 
(hereinafter Kawashima), and Standridge et al. (hereinafter Standridge) U.S. Publication 
2002/0141619. 

Referring to claim 18, Kawashima teaches a method comprising identifying a plurality of 
segments of a video, wherein the start of the plurality of segments is based upon an event, wherein 
the event is characterized by a start time based upon when the ball is put into play and an end time 
based upon when the ball is considered out of play, where each of the segments includes a 
plurality of frames of the video (pp. 871-873, sections 1.1, 1.2, 2.1 and 2.2; e.g. the at-bat event 
comprising a start point in time slightly before the pitching and an endpoint in time slightly after 
the catcher catches the ball if the batter strikes out and after the ball is thrown to a baseman if the 
ball is hit); and creating a summarization of the video by including the plurality of segments, 
where the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; 
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a. 
compressed play). However, Kawashima fails to explicitly teach a frame of the video having an 
upper spatial region being substantially darker than a lower spatial region of the frame. 
Standridge teaches a frame of the video having an upper spatial region being substantially darker 
than a lower spatial region of the frame (page 5, paragraph 0052; comparing frames of a video 
that has a region that is darker, i.e. a dark or black area, in color than another region, i.e. a light 
or white area). It would have been obvious to one of ordinary skill in the art, having the 
teachings of Kawashima and Standridge before him at the time the invention was made, to 
include Standridge's method of identification of frame segments with regions of darker and 
lighter colors in Kawashima's method of identification of video segments to produce a 
summarization. One would have been motivated to make such a combination in order to allow 
quick and easy distinguishing between regions of video frames that represent different objects. 
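
As a rough illustration of the limitation discussed for claim 18, a test for a frame whose upper spatial region is substantially darker than its lower spatial region might look like the sketch below. The split at the vertical midpoint, the mean-intensity comparison, the margin value, and the function names are all assumptions of this sketch and are not drawn from Standridge or Kawashima.

```python
from typing import List

# Hypothetical representation: a frame is a list of rows of 0-255 gray values.
Frame = List[List[int]]


def mean_intensity(rows: Frame) -> float:
    """Average gray level over all pixels in the given rows."""
    pixels = [p for row in rows for p in row]
    return sum(pixels) / len(pixels)


def upper_darker_than_lower(frame: Frame, margin: float = 50.0) -> bool:
    """True if the top half is darker than the bottom half by at least `margin`.

    Assumes the frame has at least two non-empty rows.
    """
    mid = len(frame) // 2
    return mean_intensity(frame[mid:]) - mean_intensity(frame[:mid]) >= margin
```

The margin stands in for "substantially"; what value would be appropriate for real broadcast footage is not addressed here.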

Referring to claims 19-21, Kawashima, as modified, teaches spatial regions of the video 
frame comprising a color description of a light or white nature (page 5, paragraph 0052). 
Although Kawashima as modified does not explicitly teach the lower spatial region comprises, at 
least in part, a pair of regions having a dominant color description representative of skin color or 
stage color, Kawashima as modified teaches the video frame regions comprising similar colors, 
such as white or light colors. It would have been obvious that the light or white video frame 
regions could be of a color that represents skin tone, in order to clearly identify and distinguish 
objects from persons in a video pertaining to a sporting event. 

Referring to claim 34, due to the similarity of this claim to the combination of limitations 
from claims 18-20, 27 and 31, this claim is therefore rejected for the reasons set forth above. 

10. Claims 35 and 42 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
"Indexing of Baseball Telecast for Content-based Video Retrieval", Kawashima et al. 
(hereinafter Kawashima) and "Multimedia Content Analysis", Wang et al. (hereinafter Wang). 

Referring to claim 35, Kawashima teaches a method comprising: identifying a plurality 
of segments of the video based upon an event, wherein the identifying for at least one of the 
segments includes detecting the start of the segment based upon processing of a first single frame 
of the video, where each of the segments includes a plurality of frames of the video (pp. 871-873, 
sections 1.1, 1.2, 2.1 and 2.2); and creating a summarization of the video by including the 
plurality of segments, where the summarization includes fewer frames than the video (Abstract; 
pg. 872, section 1.2; i.e. the indexed video segments are a digest of the game or summary of the 
video, a.k.a. compressed play). Kawashima fails to explicitly teach verifying that the first single 
frame is an appropriate start of the segment based upon processing of another single frame 
temporally relevant to the first single frame. Wang teaches verifying that the first single frame is 
an appropriate start of the segment based upon processing of another single frame temporally 
relevant to the first single frame (pg. 22, left column, lines 1-8). Therefore, it would have been 
obvious to one of ordinary skill in the art, having the teachings of Kawashima and Wang before 
him at the time the invention was made, to include Wang's verifying that the first single frame is 
an appropriate start of the segment based upon processing of another single frame temporally 
relevant to the first single frame in Kawashima's detecting of the start of the segment based upon 
processing of a first single frame. One would have been 
motivated to make such a combination in order to reduce errors in segmenting related scenes. 

Referring to claim 42, Kawashima teaches a method comprising identifying a plurality of 
segments of the video wherein each of the segments includes a plurality of frames of the video 
(Abstract) and creating a summarization of the video by including the plurality of segments, 
where the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; 
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a. 
compressed play). Kawashima fails to explicitly teach detecting a segment that has a temporally 
sufficiently short duration, or separating/removing the identified segment from a summarization. 
Wang teaches a method of processing a video comprising identifying a segment that has a 
temporally sufficiently short duration, or separating/removing the identified segment from a 
summarization (pg. 21, right column; pg. 29, right column; separation of interested video 
portions and commercials). Therefore, it would have been obvious to one of ordinary skill in the 
art, having the teachings of Kawashima and Wang before him at the time the invention was 
made, to include Wang's method of detecting a segment that has a temporally sufficiently short 
duration and separating/removing the identified segment from a summarization in Kawashima's 
method of detecting a play of the baseball game. One would have been motivated to make such 
a combination in order to provide users with additional criteria in content based video retrieval. 
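
By way of illustration, removing segments of sufficiently short duration from a summarization, as discussed for claim 42, could be sketched as follows. The inclusive (start_frame, end_frame) representation, the minimum length of 30 frames, and the function name are assumptions of this sketch rather than anything taught by Wang or Kawashima.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (start_frame, end_frame) indices.
Segment = Tuple[int, int]


def drop_short_segments(segments: List[Segment], min_frames: int = 30) -> List[Segment]:
    """Keep only segments that span at least `min_frames` frames."""
    return [(start, end) for start, end in segments if end - start + 1 >= min_frames]
```

For example, with min_frames=30 a five-frame segment such as (200, 204) would be excluded from the summary, while (300, 329), which spans exactly 30 frames, would be kept.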

11. Claim 38 is rejected under 35 U.S.C. 103(a) as being unpatentable over "Indexing of 
Baseball Telecast for Content-based Video Retrieval", Kawashima et al. (hereinafter 
Kawashima), as applied to claim 36 above, and "Performance Characterization of 
Video-Shot-Change Detection Methods", Gargi et al. (hereinafter Gargi). 

Referring to claim 38, Kawashima teaches all the limitations as applied to claim 36 
above, and scene changes based upon a threshold level (pg. 872, section 2.1.1; detecting scenes 
by determining a similarity compared to a threshold level between a set of frames). However, 
Kawashima fails to explicitly teach the scene change based upon a gradual transition below a 
threshold level. Gargi teaches detecting scene changes in a video similar to that of Kawashima. In 
addition, Gargi further teaches detecting shot changes based upon gradual transitions 
(Introduction). It would have been obvious to one of ordinary skill in the art, having the 
teachings of Kawashima and Gargi before him at the time the invention was made, to include 
Gargi's method of detecting scene changes based upon gradual transitions in Kawashima's 
method of detecting scene changes based upon a threshold level. One would have been 
motivated to make such a combination in order to provide users with an implementation 
preference. 

12. Claims 39-41 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
"Indexing of Baseball Telecast for Content-based Video Retrieval", Kawashima et al. 
(hereinafter Kawashima), "Multimedia Content Analysis", Wang et al. (hereinafter Wang) and 
"Automatically Extracting Highlights for TV Baseball Programs", Rui et al. (hereinafter Rui). 

Referring to claim 39, Kawashima teaches a method comprising identifying a plurality of 
segments of the video wherein each of the segments includes a plurality of frames of the video 
(Abstract) and creating a summarization of the video by including the plurality of segments, 
where the summarization includes fewer frames than the video (Abstract; pg. 872, section 1.2; 
i.e. the indexed video segments are a digest of the game or summary of the video, a.k.a. 
compressed play). Kawashima fails to explicitly teach identifying a plurality of segments that 
are temporally separated by a sufficiently short duration and then connecting the identified 
plurality of segments. Wang teaches a method of processing a video comprising identifying a 
plurality of segments that are temporally separated by a sufficiently short duration and then 
connecting the identified plurality of segments (pg. 21, right column; pg. 29, right column; 
separation of interested video portions and commercials). Therefore, it would have been obvious 
to one of ordinary skill in the art at the time the invention was made, to include Wang's method 
of identifying a plurality of segments that are temporally separated by a sufficiently short 
duration in Kawashima's method of detecting a play of the baseball game. One would have been 
motivated to make such a combination in order to provide users with additional criteria in 
content-based video retrieval. However, the modified Kawashima still does not explicitly teach 
connecting the identified plurality of segments. Rui teaches a method of processing a video 
comprising connecting the identified segments so that the summary is in the same temporal 
order as the plurality of segments within the video (Abstract; section 5.4; Introduction; a method 
of allowing users to watch just the highlights of the exciting portions instead of the whole game 
due to time constraints, i.e. highlights are extracted automatically so that viewing time can be 
reduced). Therefore, it would have been obvious to one of ordinary skill in the art, having the 
teachings of the modified Kawashima and Rui before him at the time the invention was made, to 
include Rui's method of processing a video including baseball comprising connecting the 
identified plurality of segments in the modified Kawashima's method of processing a video 
including baseball comprising a plurality of segments within the video. One would have been 
motivated to make such a combination so that the time spent viewing sequential plays in a game is 
reduced. 
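
For illustration, connecting segments that are temporally separated by a sufficiently short duration, as discussed for claim 39, might be sketched as below. The inclusive (start_frame, end_frame) representation, the maximum gap of 15 frames, and the function name are assumptions of this sketch, not teachings of Wang or Rui. This variant produces a single segment that also spans the intervening frames.

```python
from typing import List, Tuple

# Hypothetical representation: inclusive (start_frame, end_frame) indices.
Segment = Tuple[int, int]


def connect_close_segments(segments: List[Segment], max_gap: int = 15) -> List[Segment]:
    """Merge neighbouring segments separated by at most `max_gap` frames."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        # Number of frames strictly between the previous segment and this one.
        if merged and start - merged[-1][1] - 1 <= max_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

A variant that discards the intervening frames instead would keep the two segments as separate entries and simply play them back to back.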

Referring to claims 40-41, as best understood by the examiner, the modified Kawashima 
teaches a method, wherein the connecting includes discarding the frames of the video between 
the identified plurality of segments and wherein the connecting results in a single segment that 
includes the identified plurality of segments together with the frames of the video between the 
identified plurality of segments (Wang: pg. 21, right column; pg. 29, right column; separation of 
interested video portions and commercials; Rui: Abstract; section 5.4; Introduction; a method of 
allowing users to watch just the highlights of the exciting portions instead of the whole game due 
to time constraints, i.e. highlights are extracted automatically so that viewing time can be 
reduced). 

13. Claim 45 is rejected under 35 U.S.C. 103(a) as being unpatentable over "Indexing of 
Baseball Telecast for Content-based Video Retrieval", Kawashima et al. (hereinafter 
Kawashima) and "Detection of Slow-Motion Replay Segments in Sports Video for Highlights 
Generation", Pan et al. (hereinafter Pan). 

Referring to claim 45, Kawashima teaches a method comprising identifying a plurality of 
segments of the video wherein each of the segments includes a play of baseball wherein the 
segments include full-speed plays and creating a summarization of the video by including the 
plurality of segments, where the summarization includes fewer frames than the video, where a 
user may select from the summarization including only full-speed plays (Abstract; pg. 872, 
section 1.2; i.e. the indexed video segments are a digest of the game or summary of the video, 
a.k.a. compressed play where users may select a full-speed play segment among the plurality of 
segments). Kawashima fails to explicitly teach segments that include slow motion plays of the 
full-speed plays and creating a summarization where a user may select from the summarization 
comprising only slow motion plays. Pan teaches a method of processing a video including 
baseball comprising identifying a plurality of segments of the video wherein each of the 
segments includes a play of baseball ("Introduction", left column) wherein the segments include 
slow motion plays of the full-speed plays ("Introduction", right column; in processing the video, 
slow motion plays of the full-speed plays and full-speed plays are identified) and users may 
select from the summarization comprising only slow motion plays. It would have been 
obvious to one of ordinary skill in the art, having the teachings of Kawashima and Pan before 
him at the time the invention was made, to include Pan's segments that include slow-motion 
plays of the full-speed plays and creating a summarization where a user may select from the 
summarization comprising only slow motion plays in Kawashima's segments that include 
full-speed plays and creating a summarization where a user may select from the summarization 
comprising only full-speed plays. One would have been motivated to make such a 
combination in order to provide users with the ability to capture inherently important events. 
Although Kawashima and Pan do not explicitly teach the identified segments of the video 
including a play of sumo, Kawashima and Pan teach the identified segments of the video 
including a play of baseball. As previously mentioned, it would have been obvious to one of 
ordinary skill in the art to apply the video summarization of identified segments based upon 
baseball events taught by Kawashima to segments based upon other sporting events such as a 
sumo wrestling match. One would have been motivated to make such a combination in order to 
generate a compact representation of the content of video material, allowing easy browsing, 
filtering, indexing and retrieval, etc., enabling users only interested in the exciting highlights of a 
game to skip the long and often boring portions of watching a sporting match in its entirety. 

Response to Arguments 

14. Applicant's arguments filed 24 October 2005 have been fully considered but they are not 
persuasive: 

15. With respect to the 101 rejection, the applicant argues that while the identification of 
video segments could occur in one's mind, the summarization could not because the idea of the 
summarization would not actually include the identified plurality of segments of the video 
frames, but merely the memory of them. The examiner respectfully disagrees. The recited 
limitations of the rejected claims do not claim any technical features specific to video segment 
identification and the creation of a video summary from the identified segments that are required 
to be performed by a computer or computer-related apparatus. The claims merely recite that a 
plurality of segments from a video are identified based on certain criteria such as a start time or 
regions that move toward one another (i.e. two sumo wrestlers lined up across from each other 
and the start of a bout represented by the action of the two wrestlers charging each other), and 
creating a summary including those identified segments. The examiner respectfully argues that 
without any technical limitations specific to the identification of the segments and creation of the 
summary from the identified segments being physically performed on a computer apparatus, the 
identification and summary creation steps can be done by a user in his mind. For example, a user 
can watch a video of a sumo wrestling match and mentally note segments of the video, such as 
the two segments of when the players start to charge one another and when the players step 
outside the ring (as per claim 1); the user can then create a summary of the video he just 
watched by including, i.e. remembering, only the two identified segments of the players starting 
to charge one another and the players stepping outside the ring; therefore, the mental summary 
the user just created includes those two identified segments of the video. In view of the above 
arguments, the examiner respectfully maintains that the steps claimed by the rejected claims are 
not required to be done by a computer or computer-related system and are therefore an abstract 
idea that does not necessarily produce a practical application in the technological arts. 

16. The applicant argues that the summarization method of Kawashima would not work with 
sumo. The examiner respectfully disagrees. Sumo wrestling and baseball are both sports events 
that consist of a sequence of plays or events, which are further characterized by a start and an 
end. Although the specifics of the present application and the Kawashima reference are 
different, i.e. the present application is based upon events occurring for use in a video of a sumo 
match and Kawashima is based upon at-bat events occurring during a baseball game, both the 
present application and Kawashima disclose similar methods of identifying a plurality of 
segments of a sporting event based upon a starting event and creating a summarization from 
the identified segments. Since Kawashima's process of creating a video summary, i.e. identifying 
segments based upon a start and end event relating to a sports game and creating the summary 
from the segments, is similar to the process of creating a video summary of the present 
application, it would have been obvious to apply the process of creating a video summary of a 
baseball game to any video subject matter, especially any sporting event. As another example, as 
per claim 45, the user can watch a video that has a portion that was played in full speed and a 
portion that was played in slow motion; the user can then mentally note memorable segments of 
video from both the full speed and slow motion play of the video and mentally remember those 
noted segments to create a mental summary. Therefore, the examiner respectfully argues that 
Kawashima's method for video summarization could be reasonably applied to a sequence of 
plays relating to a sumo match instead of a sequence of plays relating to a baseball game. 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Ting Zhou whose telephone number is (571) 272-4058. The 
examiner can normally be reached on Monday - Friday 7:00 am - 4:30 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, John Cabeca, can be reached at (571) 272-4048. The fax phone number for the 
organization where this application or proceeding is assigned is (571) 273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 